Processing Waveforms as Trees for
Pattern Recognition


Abstract

Waveforms may be represented symbolically such that their
underlying, global structural composition is emphasized. One such symbolic
representation is the relational tree. The relational tree is a computer data
structure that describes the relative size and placement of peaks and valleys
in a waveform. Researchers have developed various distance measures
which serve as tree metrics. A tree metric defines a tree space. We are
able to cluster groups of trees by their proximity in tree space.

Linear discriminants are used to reduce vector space dimensionality
and to improve cluster performance. A tree transformation operating on a
regular tree language accomplishes this same goal in a tree space. Under
certain restrictions, relational trees form a regular tree language.

Combining these concepts yields a waveform recognition system.
This system recognizes waveforms even when they have undergone a
monotonic transformation of the time axis. The system performs well with
high signal to noise ratios, but further refinements are necessary for a
working waveform interpretation system.
Introduction


When classifying signals of one or two dimensions, it is sometimes only
necessary to take global structural information into account. By structural
information, we mean the relative height and placement of peaks and valleys in
the waveform. A new automatic waveform interpretation system has been
developed which exploits waveform structure through syntactic
representation and cluster analysis. In addition to drawing on previously
developed concepts, we show how a symbolic space can be transformed to
improve the performance of a clustering system. Simulations indicate that such
a recognition technique can classify waveforms with few errors.
This paper describes the components of the waveform recognition system and
how they interact.

In our symbolic recognition system, waveforms are represented
syntactically and then grouped into clusters which reflect some underlying
structural similarity. Ehrich and Foith [1], and Lu [2], have described
hierarchical computer data structures to represent structural information
for waveforms. Of these, the relational tree is particularly simple. We shall
use the relational tree, in concert with traditional cluster analysis and formal
tree automata theory, to construct the waveform recognition system. A tree
clustering objective function will be introduced to assess clustering
performance, and a tree transformation will be introduced to improve this
performance. The tree clustering system, illustrated by simulations,
compares favorably with numerical techniques when the circumstances are
appropriate.





The first two chapters of this report describe the building blocks of our 
waveform recognition system, i.e. the relational tree representation and 
traditional cluster analysis. Following that, Chapter 3 shows how previously 
developed tree automata theory can be applied to the problem of tree cluster 
optimization. We then introduce a tree clustering objective function to 
quantify cluster improvement. Lastly, in Chapters 4 and 5, the components 
are assembled into a total system and tested on noisy data. 

We shall begin by thoroughly describing various tree topologies for
representing signal structure. There are three: the relational tree, the
skeletal tree, and the complete tree. Due to its simplicity, the relational
tree is the most appropriate for this work. In order to provide a
comprehensive overview, we shall also discuss the skeletal and complete
trees.
Chapter 1

Review of Tree Representations

1.1 The Relational Tree Representation

Ehrich and Foith first proposed representing a signal by a relational
tree (RT) [1]. The relational tree provides a two-dimensional description of a
one-dimensional signal; it draws on the intuitive notion of a waveform as a
sequence of peaks and valleys. Without attributes attached to its nodes, the
RT contains only information about the relative sizes and placement of peaks
and valleys in the signal. Attributes may be added to the nodes of the tree to
supply information such as the value of time (independent variable) and
amplitude. The RT is insensitive to scaling on either axis. This is beneficial in
identifying signals that undergo this type of scaling.

In order to discuss the concept of relational trees, we will adopt the 
following definitions: 

Definition 1: 

A waveform segment is a one-dimensional positive function of 
finite length containing a finite number of maximum and minimum 
points. In addition, its endpoints must be lower than any other 
point in the segment. 

Ehrich and Foith [1] define a peak and a valley as relations:

Definition 2:

$(x_i^P, y_i^P)$ is a peak $P_i$ iff $y_i^P \ge y$, $\forall (x, y)$ such that
$x_i^P - \delta \le x \le x_i^P + \delta$.

$(x_j^V, y_j^V)$ is a valley $V_j$ iff $y_j^V \le y$, $\forall (x, y)$ such that
$x_j^V - \delta \le x \le x_j^V + \delta$.

Definition 3:

A waveform segment's left (right) peak is that portion of the 
segment bounded on the left (right) by the left (right) boundary of 
the segment, and on the right (left) by the lowest valley in the 
segment. 

Definition 4:

A tree is a rooted, directed, acyclic graph with no more than one
edge entering each node, and zero or more edges exiting each
node. Nodes which have zero edges exiting are called terminal
nodes. All other nodes are known as non-terminal. Each tree
has one node with no edges entering it known as the root. A tree is
binary if no more than two edges exit any node.

Definition 5:

A relational tree representing a single peak segment
consists of an isolated terminal node $t_S = p$. A segment $S$
containing two or more peaks has an associated RT consisting of a
root node $\sigma$ with two descendants $t_1$ and $t_2$, written

$t_S = \sigma(t_1, t_2)$

where $t_1$ and $t_2$ are RTs describing the left and right peaks of $S$.
Symbolically,

$t_S = p_i \mid \sigma(t_1, t_2), \quad i = 0, 1, \ldots$

Each non-terminal node in an RT represents a valley in the waveform.
Each terminal node represents a peak. The valleys are nested according to
relative depth. The root node of the RT is chosen to represent the deepest
valley in the waveform. This divides the waveform into two segments: one to
the right of this valley, and one to the left. The descendants of this node will
be RTs describing each segment. Each root node is labeled by its dominant
peak, i.e. the highest peak in either segment. The non-terminal descendants
of any node represent the deepest valleys in the right and left segments.
They are in turn labeled by their dominant peaks (see figures 1.1 a and 1.1 b).
If a segment contains only a peak and no valleys, it is represented by a
terminal node and then labeled by that peak.
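
The recursive construction described above can be sketched in Python. This is a minimal sketch and not the implementation of [1]: it assumes that the peak heights and the heights of the valleys between them have already been extracted from the segment, it labels each node with the 0-based index of its dominant peak, and it breaks ties by position (leftmost first). The names `RTNode`, `build_rt`, `frontier`, and `count_nodes` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RTNode:
    label: int                       # 0-based index of the dominant peak
    left: Optional["RTNode"] = None
    right: Optional["RTNode"] = None


def build_rt(peaks: List[float], valleys: List[float], offset: int = 0) -> RTNode:
    """Build a binary relational tree.

    peaks[i] and peaks[i + 1] are separated by valleys[i], so a segment
    with M peaks has M - 1 interior valleys.
    """
    if not peaks or len(valleys) != len(peaks) - 1:
        raise ValueError("need M >= 1 peaks and exactly M - 1 valleys")
    # Dominant peak: the highest peak in the segment (leftmost on ties).
    dominant = offset + max(range(len(peaks)), key=lambda i: peaks[i])
    if len(peaks) == 1:
        return RTNode(dominant)      # a single peak is a terminal node
    # The deepest valley (leftmost on ties) splits the segment in two.
    k = min(range(len(valleys)), key=lambda i: valleys[i])
    left = build_rt(peaks[:k + 1], valleys[:k], offset)
    right = build_rt(peaks[k + 1:], valleys[k + 1:], offset + k + 1)
    return RTNode(dominant, left, right)


def frontier(node: RTNode) -> List[int]:
    """Terminal labels read from left to right."""
    if node.left is None:
        return [node.label]
    return frontier(node.left) + frontier(node.right)


def count_nodes(node: RTNode) -> int:
    if node.left is None:
        return 1
    return 1 + count_nodes(node.left) + count_nodes(node.right)


tree = build_rt(peaks=[3, 5, 2, 4], valleys=[1, 0.5, 1.5])
print(tree.label, frontier(tree), count_nodes(tree))
```

For this four-peak example the root is labeled by peak 1 (the tallest), the frontier lists the peaks in left-to-right order, and the tree has seven nodes: four terminal nodes and three valleys.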



Figure 1.1 a
Waveform Segments

Figure 1.1 b
The Relational Tree


Ehrich and Foith list 8 properties of these trees [1]:

1) The frontier of the tree is a left to right description of the
waveform,

2) If a peak label occurs at N nodes, then that segment has N-1
subpeaks,

3) Each parent node corresponds to a valley that separates two
descendant peaks,

4) Each parent node has the same label as the descendant having the
largest vertical height,

5) If a sequence of k valleys all have the same height, the
corresponding node has k+1 descendants. If the left or right
dominant peak of a valley is not unique, the label of its node
consists of all peaks with maximum height in that segment,

6) Although several nodes in the tree have the same name, their
occurrences are different. The attribute list attached to the node
contains specific information,

7) RTs partition the set of one-dimensional functions into equivalence
classes,

8) As one moves from frontier to root through nodes of the same
label, relative peak heights will be strictly decreasing.

It is property 7 that is of most interest for this report. The partitions 
may be viewed as clusters in a feature space. The nature of the relational 
tree structure allows us to classify functions based on that structure. 

Special cases occur when valleys or consecutive peaks within a
segment have equal height. The convention is to represent these as n-tuples
as described in property 5. If the resulting trees are to be recognizable by
finite tree automata, the number of descendants from a node must be limited
to a finite number of choices. Ehrich and Foith describe a modified relational
tree implementation which imposes a binary topology on the resultant trees.
A similar implementation will be adopted for this work. Peak or valley
dominance in these cases will be decided on the basis of position.

Although the relational tree is the structure adopted for this work, two 
more tree structures for representing waveforms should be of interest to 
the reader. 
1.2 Alternate Tree Structures

Researchers have defined alternate tree structures for describing
waveforms [2]. These more complex tree structures borrow from the
relational tree concept. The results are trees whose topology represents
time and amplitude information in addition to the relative placement of peaks
and valleys.

Cheng and Lu [2] expanded the relational tree structure to take into
account amplitude and time information. These alternative tree
representations reduce the amount of semantic information which must be
stored at nodes. In the skeletal tree, the waveform amplitude is quantized.
An interval is delineated when the waveform crosses a quantization level
boundary. Each node represents a pair of these crossings (see figure 1.2).


Figure 1.2
The Skeletal Tree
Cheng and Lu also describe the "complete tree" representation. For a
complete tree, the structure reflects both waveform amplitude and interval
widths. A complete tree is constructed by superimposing a two-dimensional
grid over the waveform. Interval boundaries are decided by time line
crossings (see figure 1.3).



Figure 1.3 
The Complete Tree 

Skeletal and complete trees possess the following properties [2]:

1) The depth of a tree is equal to the number of quantization levels,

2) The leaves of the tree are peaks in the waveform. Peaks smaller
than a quantization interval are not visible,

3) The depth of a leaf is equal to the amplitude of its corresponding
peak,

4) The nearest common predecessor of two nodes corresponds to a
valley on the waveform,

5) A node represents an interval on a quantization level.

In addition, complete trees have these properties: 

1) The subtree containing all the non-terminal nodes, and only
non-terminal nodes, is the skeletal tree for that waveform,

2) The original waveform can be reconstructed by tracing the leaf 
nodes from left to right. 

Cheng and Lu used the complete tree to correlate waveforms. This was 
done by matching nodes between the trees representing two signals. 

In order to perform operations on relational trees, we need a way to
compare them. A number of techniques for calculating distances between
trees have appeared in the literature. In the next section we will examine
various authors' approaches and assess their usefulness.

1.3 Review of Tree Distance Measures 

The tree distance measures discussed here [3,4,5,6] all make use of the
minimum number of some elementary transformation necessary to make
one tree into another. The earlier tree distance measures [3], [4] rely on
syntactic error correction schemes, whereas recently developed distances
[5], [6] are independent of grammatical considerations.

Lu and Fu [3] propose a structure-preserved error-correcting tree
automaton (SPECTA). This type of distance measure parses a tree and
compares it to a given grammar. Differences between two trees are viewed
as syntactic errors. The parser finds the error correcting transformation
involving the smallest number of node substitutions. The distance between
two trees is the smallest number of node substitutions required to
transform one tree into another while maintaining predecessor-descendant
relationships. If such a transformation exists, it is symmetric and satisfies
the triangle inequality.

In another publication, Lu and Fu [4] developed a generalized
error-correcting tree automaton (GECTA). This distance measure is similar to
SPECTA, except that node insertion and deletion errors are allowed in
addition to node substitutions.

SPECTA and GECTA both require that the trees be described by some 
type of tree grammar. Neither distance measure is guaranteed to exist. 

Another tree-to-tree distance algorithm is described by Lu [5]. This
distance measure requires no knowledge of a tree grammar and employs
insertion, deletion, and substitution errors. The distance $d_{ids}(\alpha, \beta)$,
where $\alpha$ and $\beta$ are trees, is defined as the minimum cost necessary to
derive $\alpha$ from $\beta$ such that:

1) The predecessor-descendant relation does not change, 

2) Nodes in 0 do not split or merge, 

3) The sequence of postfix ordering does not change after 
transformation. 

Lu shows [6] that $d_{ids}$ can sometimes lead to anomalous values.
For this reason, a new tree matching algorithm, $d_{sm}(\alpha, \beta)$, was proposed to
make use of node splitting and merging. The basic node operations in
$d_{sm}(\alpha, \beta)$ are defined as father-son splitting, brother splitting, father-son
merging, and brother merging. The distance $d_{sm}(\alpha, \beta)$ is the minimum
number of these four operations necessary to derive $\alpha$ from $\beta$. This
distance always exists and obeys the following properties [6]:

1) $d(\alpha, \alpha) = 0$,

2) $d(\alpha, \beta) = d(\beta, \alpha)$,

3) $d(\alpha, \beta) \le d(\alpha, \gamma) + d(\gamma, \beta)$.
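
The three properties listed above can be checked numerically on a finite sample before a candidate distance is used for clustering. The sketch below is illustrative only: `satisfies_metric_properties` is a hypothetical helper, and the two toy distances on integers are stand-ins, not the tree distances of [5] or [6].

```python
from itertools import product


def satisfies_metric_properties(samples, d, tol=1e-12):
    """Check identity, symmetry, and the triangle inequality on a finite sample."""
    for a in samples:
        if abs(d(a, a)) > tol:                       # d(a, a) = 0
            return False
    for a, b in product(samples, repeat=2):
        if abs(d(a, b) - d(b, a)) > tol:             # d(a, b) = d(b, a)
            return False
    for a, b, c in product(samples, repeat=3):
        if d(a, b) > d(a, c) + d(c, b) + tol:        # triangle inequality
            return False
    return True


# Absolute difference passes; squared difference violates the triangle inequality.
print(satisfies_metric_properties([0, 1, 2, 5], lambda a, b: abs(a - b)))
print(satisfies_metric_properties([0, 1, 2], lambda a, b: (a - b) ** 2))
```

A passing check on a sample does not prove that a distance is a metric; it only fails to find a counterexample among the trees tested.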

The next chapter provides a review of clustering as it has been used to 
recognize patterns which can be described by a feature vector. We shall 
focus on the specific techniques which are of use in clustering trees. 






Chapter 2 

Review of Feature Space Clustering 

2.1 Traditional Cluster Seeking Techniques 

There are many techniques available for partitioning a set of samples into
clusters [8,9]. The non-parametric algorithms rely on a distance measure
between samples. Instead of the Euclidean distance between vectors, we can
substitute an appropriate tree distance. These techniques include the k-means
algorithm and hierarchical clustering.

The k-means algorithm [9] clusters samples by minimizing the sum of
squared distances from cluster members to the cluster center. There is no
guarantee that this algorithm converges to a globally optimal partition. Its
success depends on the value chosen for k, and the initial cluster
configuration. If this algorithm is to be used for trees, the definition of a
tree cluster center is required. Finding a cluster center is an awkward and
time consuming problem.

Hierarchical clustering algorithms are best suited for clustering RTs.
They are best in the sense that the only operation required is the distance
between samples; no vector operations (such as those needed to define a
cluster center) are involved.

The result of a hierarchical clustering procedure is a tree known as a
dendrogram [8]. The frontier of the tree is made up of clusters containing a
single sample, and the root represents the entire sample set. A parent node
represents a cluster which is the union of the clusters represented by its
immediate descendants. Clusters are grouped according to some distance
criterion such as nearest-neighbor. A typical two-dimensional sample set
and its corresponding dendrogram are shown in figure 1.4. The samples are
shown grouped according to the nearest-neighbor criterion.
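
Nearest-neighbor (single-linkage) agglomerative clustering needs only a pairwise distance, which is why it carries over to trees. The sketch below is a straightforward, unoptimized version under that assumption; `single_linkage` is a hypothetical name, and integers with absolute difference stand in for trees and a tree distance.

```python
def single_linkage(samples, d, num_clusters):
    """Merge the two closest clusters until num_clusters remain.

    Returns the clusters (lists of sample indices) and the merge history,
    which is the information a dendrogram displays.
    """
    if not 1 <= num_clusters <= len(samples):
        raise ValueError("num_clusters must be between 1 and len(samples)")
    clusters = [[i] for i in range(len(samples))]
    merges = []
    while len(clusters) > num_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Nearest-neighbor criterion: distance of the closest pair.
                dist = min(d(samples[i], samples[j])
                           for i in clusters[a] for j in clusters[b])
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        merges.append((dist, list(clusters[a]), list(clusters[b])))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters, merges


clusters, merges = single_linkage([0, 1, 2, 10, 11], lambda x, y: abs(x - y), 2)
print(clusters)
```

Running the loop down to a single cluster yields the full merge history from frontier to root.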



Figure 1.4 a
The Entire Sample Space

Figure 1.4 b
The Hierarchical Clustering Dendrogram
2.2 Syntactic/Semantic Clustering Techniques

Lu and Fu [7] proposed a cluster-seeking procedure for syntactic patterns.
A grammar is inferred for each cluster in a sample set. If the distance from
a sample to every cluster is above an arbitrary threshold, a new cluster is
started. Otherwise, the sample is assigned to the nearest cluster and a new
grammar is inferred for that cluster to account for the new sample. This is a
very time consuming process since several new grammars must be inferred
each time a new sample is added to the set. Given a tree metric, any of the
established minimum-distance clustering and classification techniques could
be used. Lu [5] used a tree metric involving node substitutions, insertions,
and deletions to hierarchically cluster handwritten characters. No attempt
was made by Fu to evaluate or optimize clusters which were formed.

2.3 Sample Classification 

Once the data are clustered, incoming samples may be classified according 
to a minimum distance criterion. This usually means assigning a sample to 
the cluster of its nearest neighbor. A variation on this is the k-nearest 
neighbor technique which assigns a sample to the cluster where a plurality of 
its k nearest neighbors lies. It may be that the resulting clusters are not 
ideal for classifying incoming samples. Individual clusters might be too 
scattered, or adjacent clusters too close together, or both. When this 
happens, we would like some way of modifying the clustering scheme so that 
each cluster contains the same samples, but with new clusters which are 
compact and well-separated (CWS). To this end, a criterion is needed which 
will indicate whether or not the clusters are CWS. The following sections
describe a transform method for modifying tree clustering, and an objective
function for assessing the performance of such a transform.
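
The minimum-distance rules described in this section can be sketched as a k-nearest-neighbor vote over labeled samples. This is a minimal sketch under stated assumptions: `knn_classify` is a hypothetical name, integers with absolute difference stand in for trees and a tree distance, and ties in the vote go to the label whose sample was encountered first among the nearest neighbors.

```python
from collections import Counter


def knn_classify(sample, labeled, d, k=1):
    """Assign `sample` to the cluster holding a plurality of its k nearest neighbors.

    `labeled` is a list of (sample, cluster_label) pairs; k=1 gives the
    plain nearest-neighbor rule.
    """
    if k < 1 or not labeled:
        raise ValueError("need k >= 1 and at least one labeled sample")
    ranked = sorted(labeled, key=lambda pair: d(sample, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


training = [(0, "A"), (1, "A"), (2, "A"), (10, "B"), (11, "B")]
print(knn_classify(9, training, lambda x, y: abs(x - y), k=3))
```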







Chapter 3 

A New Tree Clustering System 
3.1 Problem Statement

First consider the case where waveform segments are uncontaminated
by noise. The problem we wish to address here involves a given set of
signals of known origin with similar peak and valley structures. These signals
may have undergone non-uniform stretching or squeezing along the domain
axis, but they can still be grouped together. Given another signal of unknown
type, how can we determine whether or not the unknown signal belongs to the
previously known set?

This paper proposes a solution to this problem which makes use of 
relational tree structures and traditional pattern recognition. The following 
sections show that the concepts of formal tree automata theory may be 
combined with the data structures and algorithms introduced in the previous 
chapters. This combination will produce a waveform recognition system which 
is insensitive to monotonic transformations of the horizontal axis. 

3.2 Clustering Relational Trees 

Chapter 1 stated that relational trees partition the set of
waveform segments into equivalence classes. If this is so, relational trees
might provide a mechanism for classifying one-dimensional signals. Such
signals include image scan lines, speech, or ECG records. Furthermore, a
tree distance can be utilized to compare trees which are not identical. While
there are ways to classify waveforms without resorting to syntactic/semantic
techniques, the special properties of relational trees may be exploited when
certain types of signals are encountered.

When identifying patterns which can be described by some feature vector,
this problem is commonly solved by cluster analysis. There is a large body of
work pertaining to clustering patterns in a Euclidean vector space. Many of
the published algorithms make use of matrix operations in this vector space.
Since the samples considered here exist in a tree space, the only operation
available is some prespecified distance function. This makes most traditional
cluster analysis techniques less than useful. It is possible, however, to use
the same concepts while modifying the specific technique somewhat. The
methods discussed here are those that will have an application to clustering
relational trees.

A relational tree space in which clusters are formed may be thought of as
a directed graph. Each node represents a different tree. An edge exiting a
node represents an elementary operation on that tree. The distance between
two trees is the minimum path length on the directed graph from one tree to
another tree. The subspace consisting of trees T1-T6 with metric $d_{id}(x, y)$ is
shown in figure 3.1. Path "a" has length four, whereas path "b" has length two.
The distance between trees T1 and T3 is therefore two edge traversals.
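
When every elementary operation has unit cost, the minimum path length in such a graph can be found by breadth-first search. The sketch below assumes an explicit adjacency list; the edge set is hypothetical and was chosen only so that a four-edge path and a two-edge path both connect T1 to T3, since the actual topology of figure 3.1 is not reproduced here.

```python
from collections import deque


def tree_space_distance(graph, start, goal):
    """Minimum number of edge traversals from start to goal, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


# Hypothetical subspace: one four-edge route and one two-edge route from T1 to T3.
subspace = {
    "T1": ["T4", "T2"],
    "T4": ["T5"],
    "T5": ["T6"],
    "T6": ["T3"],
    "T2": ["T3"],
    "T3": [],
}
print(tree_space_distance(subspace, "T1", "T3"))
```

In practice the graph is never built explicitly; a tree distance algorithm searches it implicitly by generating elementary operations on demand.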







Figure 3.1
A Tree Subspace (paths a and b)

Since the path from one tree to another is dependent on the particular type
of elementary tree operation chosen, it is clear that the tree distance
algorithm selected will determine the nature of clusters formed.

3.3 Theorems Concerning Relational Trees

In order to apply the concepts of tree automata theory to waveforms
represented as relational trees, we have to formally assert certain facts
about them. Among these facts are the existence of a relational tree, and
whether it can be recognized by a tree automaton.

Existence theorem: 

For all one-dimensional, unipolar, bounded functions with a finite 
number of maxima and minima in a finite interval, there exists a relational 
tree description for that interval. 








Proof:

Construct a tree by induction over $M$, the number of peaks in the
segment.

Base case: $M = 1$. This is a segment consisting of only one peak. Its
relational tree description is a single node $t_1 = p$.

Inductive Hypothesis: $M = m$. A tree describing a segment with $m$
peaks can be written $t_m = \sigma(t_{m_1}, t_{m_2})$, where $t_{m_1}$ contains $m_1$ peaks
and $t_{m_2}$ contains $m_2$ peaks, and $m_1 + m_2 = m$.

Inductive Step: A new segment with $m+1$ peaks is formed by adding
a peak $p_{m+1}$ and a new valley on to the left of the segment with $m$ peaks. The
relational tree description $t_{m+1}$ can be described recursively as follows:

If the new valley is lower than any of the valleys in $t_m$, then

else

$t_{m+1} = \sigma(t_{m_1}, t_{m_2+1})$.

Corollary:

A binary relational tree representing a waveform segment with M maxima
will have 2M-1 nodes: M terminal nodes, one for each peak, and M-1
non-terminal nodes, one for each valley separating two peaks.

If a tree is recognizable by a finite tree automaton, it is possible to
perform certain operations on that tree via finite state machines. Later in
this paper, we will want to transform relational trees according to some
predefined node operations. The effects of such a transformation are well
known if the tree being operated upon is recognizable.










Recognizability Theorem:

If relational tree descriptions are limited to binary trees as described
in Ehrich and Foith's paper [1], then the forest of relational trees is a subset
of those $\Sigma X$-trees which are recognizable by a finite tree automaton.

Proof:

Prove by constructing a regular tree grammar for describing the
relational trees. Since a $\Sigma X$-forest $T(G)$ is generated by a regular tree
grammar $G$, and $T(G) \in Rec(\Sigma, X)$ for every such $\Sigma X$-forest, the resulting
trees will be recognizable. The grammar $G$ is as follows:

$G = (N, \Sigma, X, P, \alpha_0)$, where:

$N = \{\alpha\}$, a non-terminal alphabet
$\Sigma = \Sigma_2 = \{\sigma\}$, a ranked alphabet
$X = \{p\}$, a terminal alphabet
$P = \{\alpha \to p,\ \alpha \to \sigma(\alpha, \alpha)\}$, a set of productions
$\alpha_0 = \alpha$, a starting symbol

Example:

A leftmost derivation of the relational tree shown in figure 1.1 b is:

$\alpha \Rightarrow \sigma(\alpha, \alpha) \Rightarrow \sigma(p, \alpha) \Rightarrow \sigma(p, \sigma(\alpha, \alpha)) \Rightarrow \sigma(p, \sigma(\sigma(\alpha, \alpha), \alpha))$
$\Rightarrow^{*} \sigma(p, \sigma(\sigma(p, p), p))$
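
Membership in the forest generated by this grammar can be tested with a one-state frontier-to-root automaton, sketched below. The encoding is an assumption made for illustration: a terminal node is the string "p", and a non-terminal node is the tuple ("sigma", left, right). The function name `recognize` is hypothetical.

```python
def recognize(tree):
    """Return True if `tree` is generated by alpha -> p | sigma(alpha, alpha).

    The automaton has a single state; a subtree reaches that state exactly
    when it is a leaf "p" or a sigma node whose two children both reach it.
    """
    if tree == "p":
        return True
    return (
        isinstance(tree, tuple)
        and len(tree) == 3
        and tree[0] == "sigma"
        and recognize(tree[1])
        and recognize(tree[2])
    )


# The tree reached at the end of the example derivation.
example = ("sigma", "p", ("sigma", ("sigma", "p", "p"), "p"))
print(recognize(example))
```

Trees with a unary node, a third child, or an unknown label are rejected because no production covers them.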

If the nodes are labeled as shown in figure 1.1, the tree grammar
becomes context-sensitive and is no longer recognizable. We use the term
context-sensitive because the assignment of a sequential peak label requires
knowledge of the surrounding peak labels which have already been fixed. To
avoid such context sensitivity, we will adopt a node labeling scheme which is
based on peak height rather than sequence.

3.4 Tree Cluster Improvement

We wish to find a transformation $T(x \mid x \in X)$ on the set of sample trees $X$
such that the probability of sample misclassification is minimized. In order to
do this, we will define an objective function which is to be maximized. A tree
transform mechanism and two methods for finding the objective function,
$J_1(X, T)$ and $J_2(X, T)$, are presented in the following sections.

3.5 A Tree Transformation 

In traditional feature space clustering, the objective function is often
optimized by finding an appropriate linear transformation on the feature
space. This can be reduced to a simple unconstrained minimization problem.
Since we lack such tools as matrix multiplication when dealing with trees,
finding the proper transformation to improve cluster separation becomes a
search problem. A tree transformation is based on a tree transducer. The
following formal definition of a tree transformation lays the foundation for
improving clusters in a tree space. A frontier-to-root tree transducer $U$
is a seven-tuple [11]:

$U = (\Sigma, X, A, \Omega, Y, P, A')$

where

$\Sigma$, $\Omega$ are ranked alphabets,
$X$, $Y$ are frontier alphabets,
$A$ is a state set,
$A'$ is a set of final states,
$P$ is a finite set of productions or rewriting rules of the following types:

i) $x \Rightarrow a(q)$, where $x \in X$, $a \in A$, $q \in F_\Omega(Y)$;

ii) $\sigma(a_1(\xi_1), \ldots, a_m(\xi_m)) \Rightarrow a(q(\xi_1, \ldots, \xi_m))$, where
$\sigma \in \Sigma_m$, $m \ge 0$, $a_1, \ldots, a_m, a \in A$,
$q(\xi_1, \ldots, \xi_m) \in F_\Omega(Y \cup \Xi_m)$, and
$\Xi_m = \{\xi_1, \ldots, \xi_m\}$ is a set of auxiliary variables that indicate
the occurrence of a subtree in a tree.

$T_U$, then, is the tree transformation induced by $U$:

$T_U = \{(p, q) \mid p \in F_\Sigma(X),\ q \in F_\Omega(Y),\ p \Rightarrow^{*} a q,\ a \in A'\}$

The rewriting rules are the variables of the transformation. A
transformation is tailored to a specific application by choosing the proper rules.
These rules represent mappings between subtrees in the input forest, $F_\Sigma(X)$,
and the output forest, $F_\Omega(Y)$. States are placed in the tree during
transformation to guide the rewriting rule application.

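Example of a Tree Transformation

$U = (\Sigma, \{x\}, \{a_0, a_1\}, \Omega, \{y\}, P, \{a_0\})$

$\Sigma = \Sigma_2 = \{\sigma\}$, $\Omega = \Omega_1 = \{\omega\}$

$P$: $x \Rightarrow a_1 y$, $\sigma(a_1(\xi_1), a_1(\xi_2)) \Rightarrow a_0 \omega(\xi_1)$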



A more complex and useful tree transform example is shown in the next 
chapter. 
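
The small example transducer of this section can be traced in code. The sketch below rests on one reading of its two rules, namely that a frontier symbol x is rewritten to y in state a1, and that a sigma node whose two children are both in state a1 is rewritten to omega applied to the first child's output, in final state a0. That reading, the tuple encoding of trees, and the names `run` and `transform` are assumptions.

```python
def run(tree):
    """Frontier-to-root pass: return (state, output) or None if no rule applies."""
    if tree == "x":
        return ("a1", "y")                       # x => a1 y
    if isinstance(tree, tuple) and len(tree) == 3 and tree[0] == "sigma":
        left = run(tree[1])
        right = run(tree[2])
        if left and right and left[0] == "a1" and right[0] == "a1":
            # sigma(a1 xi1, a1 xi2) => a0 omega(xi1): the second subtree is dropped.
            return ("a0", ("omega", left[1]))
    return None


def transform(tree, final_states=("a0",)):
    """Return the output tree if the pass ends in a final state, else None."""
    result = run(tree)
    if result is None or result[0] not in final_states:
        return None
    return result[1]


print(transform(("sigma", "x", "x")))
```

Under this reading only the input sigma(x, x) has an image: a lone x ends in the non-final state a1, and no rule accepts a child that is already in state a0.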
3.6 A Tree Clustering Objective Function

Once the data undergo a transformation, the effect on cluster compactness
and separateness must be assessed. Both objective functions presented here
will utilize the ratio of between-class scatter to within-class scatter.

Method 1

For the two class problem, we define the within-class scatter for a
transformed cluster $Y_i$ of size $|Y_i|$ as:

$$S_i = \frac{1}{|Y_i|^2} \sum_{y \in Y_i} \sum_{x \in Y_i} d(y, x)$$

and the between-class scatter as

$$S_B = \frac{1}{|X_1|\,|X_2|} \sum_{x_1 \in X_1} \sum_{x_2 \in X_2} d(x_1, x_2)$$

$X_i$ and $Y_i$ are transformed clusters of trees. $|X_i|$ is the number of sample
trees in $X_i$. $d(x, y)$ is some metric between trees. The objective function for
the two class case is:

$$J_1(X, T) = \frac{S_B}{S_1 + S_2}$$

In order to generalize to $c$ clusters, we must define the total within-class
scatter as:

$$S_W = \frac{1}{c} \sum_{X_i \in X} \frac{1}{|X_i|^2} \sum_{y \in X_i} \sum_{x \in X_i} d(x, y)$$

and the between-class scatter as:

$$S_B = \frac{1}{c} \sum_{X_i \in X} \frac{1}{|X_i|\,(|X| - |X_i|)} \sum_{x_i \in X_i} \sum_{x \in X,\ x \notin X_i} d(x_i, x)$$

where $X$ is the entire tree sample set after applying the transformation $T$. The
objective to be maximized for the case where there are more than two
clusters is:

$$J_1(X, T) = \frac{S_B}{S_W}$$
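
The Method 1 ratio for c clusters can be computed directly from pairwise distances. The sketch below assumes that the within-class scatter is the mean pairwise distance inside each cluster, averaged over clusters, and that the between-class scatter is the mean distance from each cluster's members to all samples outside it, averaged over clusters. The function names are hypothetical, and integers with absolute difference stand in for transformed trees and a tree metric.

```python
def within_scatter(cluster, d):
    """Mean pairwise distance inside one cluster (self-pairs included)."""
    n = len(cluster)
    return sum(d(x, y) for x in cluster for y in cluster) / (n * n)


def between_scatter(clusters, d):
    """Mean distance from each cluster to the samples outside it, averaged over clusters."""
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    total = 0.0
    for i, ci in enumerate(clusters):
        others = [x for j, cj in enumerate(clusters) if j != i for x in cj]
        total += sum(d(x, y) for x in ci for y in others) / (len(ci) * len(others))
    return total / len(clusters)


def j1(clusters, d):
    """Ratio of between-class to within-class scatter; larger is better."""
    s_w = sum(within_scatter(c, d) for c in clusters) / len(clusters)
    s_b = between_scatter(clusters, d)
    return s_b / s_w if s_w > 0 else float("inf")


dist = lambda x, y: abs(x - y)
print(j1([[0, 1], [10, 11]], dist), j1([[0, 5], [10, 15]], dist))
```

The compact pair of clusters scores higher than the scattered pair even though the average distance between the two clusters is the same in both cases.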

Method 2

As an alternative method, consider the following definition of a cluster
center for syntactic patterns [10].

First, define the $\beta$-metric for a tree $x_{ij}$ in cluster $X_i$ to be:

$$\beta_{ij} = \frac{1}{|X_i|} \sum_{l=1}^{|X_i|} d(x_{ij}, x_{il})$$

Let $x_{ij}$ be the cluster center of $X_i$, denoted by $m_i$, if its $\beta$-metric is
minimum over the cluster, i.e.

$$\beta_{ij} = \min_{l} \{\beta_{il} \mid 1 \le l \le |X_i|\}$$

Using this definition of a cluster mean, let the within-cluster scatter be
the sum of $\beta$-metrics for each cluster center squared:

$$S_W = \sum_{i=1}^{c} \beta_{ij}^2, \quad m_i = x_{ij} \text{ for all } i$$

Define the total mean $m$ to be the median sample tree over the entire
sample set:

$$m = \{x \mid \sum_{y \in X,\ y \ne x} d(x, y) \text{ is minimum } \forall x \in X\}$$

The between-class scatter is:

$$S_B = \frac{1}{c} \sum_{i=1}^{c} d(m_i, m)$$

Now maximize the objective function $J_2(X, T) = S_B / S_W$ using these
expressions for $S_B$ and $S_W$.
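
Method 2 replaces vector means by medoid-like centers, so it too needs nothing beyond a distance function. The sketch below reads the per-sample measure as the mean distance from a sample to the members of its own cluster, the cluster center as the sample minimizing that measure, and the total mean as the sample minimizing its summed distance to the whole set; ties go to the first sample encountered. The function names are hypothetical, and integers stand in for trees.

```python
def beta_metric(x, cluster, d):
    """Mean distance from x to the members of its cluster."""
    return sum(d(x, y) for y in cluster) / len(cluster)


def cluster_center(cluster, d):
    """The member with the smallest beta-metric (first one on ties)."""
    return min(cluster, key=lambda x: beta_metric(x, cluster, d))


def j2(clusters, d):
    centers = [cluster_center(c, d) for c in clusters]
    # Within-cluster scatter: squared beta-metric of each center, summed.
    s_w = sum(beta_metric(m, c, d) ** 2 for m, c in zip(centers, clusters))
    # Total mean: the median sample over the entire sample set.
    everything = [x for c in clusters for x in c]
    total_mean = min(everything, key=lambda x: sum(d(x, y) for y in everything))
    # Between-class scatter: mean distance from the centers to the total mean.
    s_b = sum(d(m, total_mean) for m in centers) / len(clusters)
    return s_b / s_w if s_w > 0 else float("inf")


dist = lambda x, y: abs(x - y)
print(cluster_center([0, 1, 2], dist), j2([[0, 1, 2], [10, 11, 12]], dist))
```

Each evaluation costs a number of distance computations quadratic in the sample count, which matters when the tree distance itself is expensive.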
3.7 Transform Application

Assuming the input and output alphabets of the transformation are fixed,
an optimal transformation may be found over different combinations of
rewriting rules. The transform designer searches for an optimal set of
rewriting rules. The search is performed over a candidate rule set. Candidate
rules are application specific and determined a priori. In the following
chapter we show how such rules are formulated. In a step by step fashion, the
waveform classification algorithm is:

1) Obtain a training set of waveforms,

2) Convert to their relational tree representations,

3) Cluster these trees in a metric space (directed graph),

4) Find a transformation on these trees such that the objective
function described previously is maximized,

5) Convert candidate waveforms to their relational tree
representation,

6) Apply the transformation found above to the candidate trees,

7) Classify the candidate trees by a nearest neighbor technique.

The block diagram shown in figure 3.2 depicts such a waveform
classification scheme.
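
Step 4 of the algorithm above is a search over subsets of a candidate rule set. The sketch below uses exhaustive enumeration, which is an assumption made for illustration and is practical only for small candidate sets because the number of subsets grows exponentially. The names `apply_rules` and `best_rule_subset`, the toy rules, and the toy objective are hypothetical; integers stand in for trees.

```python
from itertools import combinations


def apply_rules(rules, sample):
    """Apply each rule in order; a rule is any function from sample to sample."""
    for rule in rules:
        sample = rule(sample)
    return sample


def best_rule_subset(candidates, clusters, objective):
    """Return the rule subset that maximizes the objective, with its score."""
    best_subset, best_score = (), objective(clusters)
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            transformed = [[apply_rules(subset, s) for s in c] for c in clusters]
            score = objective(transformed)
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


# Toy stand-ins for rewriting rules and for a two-cluster objective.
def snap_to_ten(v):
    return 10 * ((v + 5) // 10)


def parity(v):
    return v % 2


def toy_objective(clusters):
    spread = sum(max(c) - min(c) for c in clusters)
    gap = abs(sum(clusters[0]) / len(clusters[0]) - sum(clusters[1]) / len(clusters[1]))
    return gap / (1 + spread)


subset, score = best_rule_subset([snap_to_ten, parity], [[1, 3], [11, 13]], toy_objective)
print([rule.__name__ for rule in subset], score)
```

Here the first toy rule removes within-cluster variation while keeping the clusters apart, so it is selected alone; the second rule collapses both clusters onto the same value and is rejected.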


















Chapter 4 

Tree Transform Implementation 


The examples of structures and algorithms discussed in the preceding 
sections represent simple and ideal cases. When adapting these techniques to 
actual problems, some additional constraints and modifications are 
necessary. Specifically, we need to discuss relational tree labeling, and the 
details of the transform and optimization algorithms. 

4.1 RT Node Labels 

The node labeling convention shown in figure 4.1 is to identify a peak by
its relative sequential position. This is unsatisfactory when small errors in
segmentation occur. Two waveforms may be structurally similar, but adding a
single unwanted peak to the front of one waveform will cause its entire
relational tree to be labeled differently from the other. This leads to a
falsely large distance between the two waveforms.

Figure 4.1
Sequential peak labels
The node labeling convention adopted for this research is to label peaks
by their relative size within the waveform segment (see figure 4.2). Peak
heights are scaled and quantized to L levels. When labeled in this manner, all
relational trees will have root label $P_{L-1}$. The smallest peak in a segment will
have label $p_0$. This labeling scheme leads to non-unique labels for peaks.
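
Height-based labels can be produced with a simple scale-and-quantize step. The sketch below assumes uniform quantization between the smallest and largest peak of the segment, so that the tallest peak maps to level L-1 and the smallest to level 0; the exact scaling rule is an assumption, and `quantize_peak_labels` is a hypothetical name.

```python
def quantize_peak_labels(heights, levels):
    """Map each peak height to an integer level in 0 .. levels - 1."""
    if levels < 1 or not heights:
        raise ValueError("need at least one level and one peak")
    lo, hi = min(heights), max(heights)
    if hi == lo:
        # All peaks equal (or a single peak): use the top level.
        return [levels - 1] * len(heights)
    return [min(levels - 1, int((h - lo) / (hi - lo) * levels)) for h in heights]


print(quantize_peak_labels([3, 5, 2, 4], levels=4))
print(quantize_peak_labels([2.5, 3, 5, 2, 4], levels=4))
```

In the second call an extra small peak has been added at the front of the segment. The labels of the four original peaks are unchanged, which is the robustness that sequential labels lack.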



Figure 4.2
Modified labeling scheme.

This type of labeling requires a new and more complicated grammar.
Such a grammar $G$ for $L$ peak quantization levels is as follows:

$G = (N, \Sigma, X, P, \alpha_0)$

$N = \{\alpha_n \mid n = 0, 1, \ldots, L-1\}$.

$\Sigma = \{P_n \mid n = 1, 2, \ldots, L-1\}$.

$X = \{p_n \mid n = 0, 1, \ldots, L-1\}$.

$P = \{\alpha_n \to P_n(\alpha_n, \alpha_m) \mid P_n(\alpha_m, \alpha_n) \mid p_n,\ 0 \le m < n\}$.

$\alpha_0 = \{\alpha_{L-1}\}$, the starting symbol.

An example derivation is shown in figure 4.3.
Figure 4.3

A sample relational tree derivation. 


4.2 Transform Implementation 

The purely theoretical transform example given in the previous 
chapter is not sufficiently complex for a practical application involving 
relational trees. Some correlation must exist between the transform 
rewriting rules and the effects those rules have on the underlying waveform. 
The user of a waveform classification system is not concerned with tree 
manipulations as long as clusters are formed that reflect structural 
similarities between waveforms. Rewriting rules cannot be considered as 
individuals due to the interaction amongst them. These rules may be grouped 
according to their combined effects on a waveform. A subset of states and
rewriting rules intended to affect one particular subtree will henceforth be
referred to as an operation. As an example, consider the following transform:

Locate and replace all occurrences of segments consisting of the two
peaks p4, p5 with the single peak p5. The operation to perform this on a
relational tree is the set of states Q:

Q = {q4, q5}

and rewriting rules P:

P = {p4 => q4(p4), p5 => q5(p5), P5(q4,q5) => q5(p5)}

Figure 4.4 shows the results of applying this transform to the tree
P5(p4, P5(p4, p5)).



Figure 4.4

A transform involving one operation. 
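The single operation above can be traced with a small bottom-up rewriting sketch. The tuple encoding of trees, the dictionary encoding of rules, and the function name `transform` are assumptions made for illustration; the rules themselves are the three given in the text (leaves p4 and p5 enter states q4 and q5, and a P5 node whose children are in states q4, q5 is replaced by the leaf p5).

```python
# A tree is (label, children); a leaf has an empty children tuple.
LEAF = ()

# Hypothetical encoding: (label, child states) -> (new state, output tree).
RULES = {
    ("p4", ()): ("q4", ("p4", LEAF)),
    ("p5", ()): ("q5", ("p5", LEAF)),
    ("P5", ("q4", "q5")): ("q5", ("p5", LEAF)),
}


def transform(tree, rules):
    """Rewrite a tree from the leaves upward; return (state, new tree)."""
    label, children = tree
    # Children are transformed first so that their states are known.
    child_states = tuple(transform(child, rules)[0] for child in children)
    # This sketch requires a matching rule at every node (KeyError otherwise).
    return rules[(label, child_states)]


tree = ("P5", (("p4", LEAF), ("P5", (("p4", LEAF), ("p5", LEAF)))))
state, result = transform(tree, RULES)
print(state, result)
```

With these rules the inner subtree P5(p4, p5) collapses to p5 in state q5, after which the root sees child states (q4, q5) and collapses in the same way, leaving the single leaf p5.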
4.3 Tree Transform Rule Selection

Rules must be chosen so that they are consistent and non-ambiguous. 
For example, individual operations must not interfere with each other. It is 
also necessary that no two rules in a transform share the same left-hand
side. This constraint prevents ambiguous transforms. When transforming
relational trees, the proper peak dominance relations must remain intact. To 
accomplish this, any relational tree transform must include rewriting rules 
that maintain peak dominance conventions when a node's descendants have 
been altered. These rules should relabel the node so that the proper peak is 
dominant. 

The tree transform algorithm as implemented here accepts input from 
the user in the form of states and rewriting rules. Beginning with the leaves 
of the tree, each node is compared to the left-hand sides of the rule list. If a 
match is detected, the proper action is taken. If no match is detected, the 
node is checked for peak dominance. This step frees the user from having to 
specify rules to maintain proper peak dominance. In practice, the system 
reserves two states: CHANGE and NOCHANGE. If no rule is matched at a node, 
and either the node is a leaf or all the states descendant from the node are 
NOCHANGE, then the subtree is propagated intact with state NOCHANGE. If one 
of the descendant states is CHANGE, peak dominance is checked and the 
subtree is propagated with state CHANGE. Care must still be taken that user 
specified rules preserve peak dominance. 
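The default behavior with the two reserved states can be sketched as below. Several details are assumptions rather than the implemented system: labels are written "P<n>" or "p<n>" with n the quantized level, "checking peak dominance" is taken to mean relabeling a node with the largest level among its children, and any child state other than NOCHANGE is treated as a change.

```python
NOCHANGE, CHANGE = "NOCHANGE", "CHANGE"


def level(tree):
    """Quantization level encoded in a label such as 'P5' or 'p3'."""
    return int(tree[0][1:])


def transform(tree, rules):
    """Bottom-up rewrite with a default action when no rule matches."""
    label, children = tree
    results = [transform(child, rules) for child in children]
    states = tuple(state for state, _ in results)
    new_children = tuple(subtree for _, subtree in results)

    if (label, states) in rules:                 # a user rule matched
        return rules[(label, states)]
    if all(state == NOCHANGE for state in states):
        # Leaf, or nothing below was altered: propagate the subtree intact.
        return NOCHANGE, tree
    # Something below changed: relabel so the dominant peak is correct.
    dominant = max(level(child) for child in new_children)
    return CHANGE, ("P%d" % dominant, new_children)


# Hypothetical rule that demotes the leaf p5 to p3 (state q3).
rules = {("p5", ()): ("q3", ("p3", ()))}
tree = ("P5", (("p5", ()), ("p4", ())))
print(transform(tree, rules))
```

In this example no rule matches the root, but one child was altered, so the root is relabeled P4 to reflect that p4 is now the dominant peak.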





4.4 Search Algorithm 

The algorithm which searches for an optimum transform is a critical 
aspect of the classification scheme. This is a very time consuming procedure, 
and any modifications which reduce the algorithm's complexity will result in 
great cost savings. 

If there are n compatible operations, the worst-case search time
complexity is O(M), where

$M = \sum_{i=0}^{n-1} \frac{n!}{i!\,(n-i)!}$

is the number of different combinations of operations.
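To give a feel for how quickly the search grows, the count can be evaluated directly. The sketch below reads the sum as running over i = 0, ..., n-1 with binomial terms n!/(i!(n-i)!), which equals 2^n - 1. The enumeration of non-empty subsets is an assumption about which combinations are searched; it produces the same count. The function names are illustrative.

```python
from itertools import combinations
from math import comb


def count_combinations(n):
    """Sum of n!/(i!(n-i)!) for i = 0..n-1, which equals 2**n - 1."""
    return sum(comb(n, i) for i in range(n))


def enumerate_combinations(operations):
    """Yield every non-empty subset of the candidate operations."""
    for size in range(1, len(operations) + 1):
        yield from combinations(operations, size)


for n in (4, 8, 16):
    print(n, count_combinations(n))
```

Each combination requires a full evaluation of the objective function, so even a modest list of candidate operations leads to a large number of tree distance computations.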

Unless there is some way to predict the outcome of a tree-matching 
(tree distance) algorithm, J(X,T) must be computed for each combination of 
operations. If the algorithm can be implemented on parallel processors, the 
search time could be reduced. 

Tree distance computations are the most costly portion of the search
algorithm. In the case where particular trees are known a priori to be the
desired cluster centers, the number of distance computations necessary at
each search node may be reduced considerably by using the objective function
J2(X,T). If there are n samples, the objective function J1(X,T) requires T1 tree
distance computations for each combination of operations, where

With J2(X,T), only T2 = n+c distance calculations are necessary, where c is
the number of clusters. It is simple to show that for all n greater than 5, T2 <
T1. Therefore, a search using the objective function J2(X,T) is faster than one
utilizing J1(X,T), so long as no cluster center computations are required.



4.5 An Alternate Implementation 

A sequential language such as Fortran, C or Pascal is well suited for 
implementing the numerical computations one encounters in signal 
processing. Researchers are accustomed to thinking sequentially and the 
field of waveform recognition has grown up around this sequential and 
deterministic paradigm. So far, we have assumed the proposed relational
tree waveform recognition system is to be implemented in a traditional Von
Neumann computer language. Since this is the "natural" way to program the
current generation of computers, such an approach is probably appropriate.
However, the operations involved in our waveform classification system are 
radically different from typical signal processing tasks. Our techniques are 
symbolic rather than numerical. They are reasoning rather than 
computational. It is awkward to express reasoning with a Von Neumann 
language. In addition, operations on trees need not be sequential. Instead of 
transforming a waveform by operating on one digitized sample at a time, as 
in conventional signal processing, tree transformations possess inherent 
parallelism. While computers are currently available to execute the tree
transformation in parallel, resulting in considerable time savings, parallelism
is difficult to express in a Von Neumann-type language. With these
considerations in mind, a more convenient way to implement our waveform
classification system is sought.

Since we are trying to imitate the human capacity for detecting structural
similarities in waveforms, the relational tree classification system may be
characterized by the catch-all phrase "Artificial Intelligence". Workers in
this field have found two alternatives to Von Neumann languages that are
well-suited to expressing this type of computation. There are functional languages
such as Lisp, and logic programming languages such as Prolog. Lisp operates
solely on lists. Prolog, however, has more flexible typing, making it easier to
represent trees with greater than binary branching factor. Prolog is also
attractive in that it operates in a similar fashion to our tree transformation.
In Prolog, knowledge is represented descriptively in a database which may be
easily modified, while a separate inference engine uses that knowledge in a
methodical manner. The inference engine is the same for all applications. In
our system, the tree transform rewriting rules are expressed as facts in a
user modifiable text file. These facts change according to the application.
The tree transform procedure, itself a type of inference engine, is never
altered. Finally, Prolog seems the best choice for an implementation language
since it is quite easy to express parallelism. Parallel implementations of
Prolog, such as Parlog, already exist [12].

Although this paper is not intended as an introduction to Prolog, a brief
description is given here; a more thorough treatment may be found in Clocksin
and Mellish [13]. Prolog is a system for doing inference using clausal logic. A
Prolog program consists of rules and facts in the form of Horn clauses.
These clauses reside in the Prolog database. The user presents queries to
Prolog, also in the form of Horn clauses. These queries are known as goals.
Prolog decides whether or not a goal clause is consistent with the rules and
facts in the database, and instantiates whatever variables are required to
make the goal true. The inference method used by Prolog to make these
decisions is known as resolution.
The waveform recognition system presented here has been implemented in
the programming language C. The tree transform program as written is a
complex procedure with many address comparisons and multiple indirections.
The Prolog implementation of a tree transform is much simpler than its
counterpart in a sequential language. In addition, the Prolog version has the
flexibility to work "backwards". For instance, one could instantiate the new
tree and ask Prolog to find the tree transform which led to it. The suggested
Prolog program is listed below. Tree nodes are represented by the function
symbol node(Label, desclist), where desclist is the list of descendant nodes.
Transform rules are in the database in the form:

rule(lbl,statelist,new_label,newstate,desclist),

where a node labeled lbl with descendant states given by statelist is
transformed to a node with label new_label and state newstate, and has
descendants given by desclist.

% nodexform(old node, new state, new node)
% This predicate transforms a node and assigns it a state.

nodexform(node(Lbl,Ndlist),State,node(Newlbl,Newndlist2)) :-
    xformchildrn(Ndlist,Newndlist1,States),
    rule(Lbl,States,Newlbl,State,Childlist),
    selectchildrn(States,Newndlist1,Childlist,Newndlist2).

% xformchildrn(list of nodes, new list of nodes, list of states)
% This predicate transforms a list of nodes.

xformchildrn([],_,_).
xformchildrn([Node|Nlist],[Newnode|Newnlist],[State|Slist]) :-
    nodexform(Node,State,Newnode),
    xformchildrn(Nlist,Newnlist,Slist).

% selectchildrn(state list, node list, child list, new node list)
% This predicate selects the new list of children for the transformed node
% given Clist from the transform rule.

selectchildrn(_,_,[],_).
selectchildrn([S|Slist],[N|Nlist],[C|Clist],[N|Newnlist]) :- S == C.
selectchildrn(Slist,Nlist,[C|Clist],Newnlist) :-
    selectchildrn(Slist,Nlist,Clist,Newnlist).

The program listed above expresses the parallelism inherent in the tree
transform algorithm. The descendants of any node may be processed
simultaneously, because they are not dependent on each other in any way.
If, in the predicate xformchildrn(), the elements of Nlist could be processed
in parallel, the running time of the algorithm would be reduced substantially.
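The independence of sibling subtrees can be illustrated with a short sketch in which the children of each node are handed to separate worker threads. This is not the Prolog or C implementation discussed above. The tuple tree encoding, the function `relabel`, and the stand-in node operation `rename` are assumptions, and Python threads only demonstrate the structure of the computation, not a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def relabel(tree, rename):
    """Apply `rename` to every label, processing sibling subtrees concurrently.

    A tree is (label, children); a leaf has an empty children tuple.
    """
    label, children = tree
    if not children:
        return (rename(label), ())
    # One short-lived pool per node: siblings never wait on each other,
    # and map() returns results in the original left-to-right order.
    with ThreadPoolExecutor(max_workers=len(children)) as pool:
        new_children = tuple(pool.map(lambda c: relabel(c, rename), children))
    return (rename(label), new_children)


tree = ("P5", (("p4", ()), ("P5", (("p4", ()), ("p5", ())))))
print(relabel(tree, str.upper))
```

A parent can only be rewritten once all of its children have finished, so the critical path of such a scheme is the depth of the tree rather than its node count.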








Chapter 5 
Applications 

In order to demonstrate the usefulness of the concepts and algorithms
developed in the preceding chapters, we will apply them to some real
problems encountered in waveform recognition. The areas examined will be
reflection seismic exploration for stratigraphic anomalies, identification and
classification of electrocardiogram abnormalities, and the interpretation of
stereo images.

5.1 Review 

Let us reiterate the concepts which have led to the waveform
recognition system. Just as patterns can be represented as strings over
some grammar, waveforms can be described by a relational tree. Valley
information is stored at non-terminal tree nodes, and peak information is
stored at the leaves. A left-to-right traversal of the leaves yields a
left-to-right description of the peaks in the waveform. An attractive feature of
relational trees is that they are insensitive to scaling along the horizontal
axis of the waveform. If certain restrictions are imposed on relational
trees, they may be represented by a tree grammar. The set of all relational
trees is the forest of trees described by such a grammar.

Just as the Levenshtein distance is used as a metric between strings,
there are tree distances by which we can examine the similarity of two trees.
Such a distance allows the relative positioning of relational trees in a tree
space. A tree space is a directed graph where each relational tree occupies
a unique node. The distance between two trees is the length of the path along
the graph from one to another. Using this concept, we can "cluster" relational trees in
such a space. Hopefully, similar waveforms would generate relational trees 
which lie close to each other in tree space. Using the well-known techniques 
of cluster analysis, we can treat relational trees much as a vector in pattern 
space. This immediately provides us with two very useful operations. The 
first, cluster-seeking, may be used to segment a set of unknown waveforms 
into classes. These classes should reflect some kind of underlying structural 
similarity. The second, classification, allows us to recognize individual 
unknown waveforms as belonging to one of a number of known classes. 

Two criteria for assessing clustering performance have been 
introduced. They are intended to convey the degree to which clusters are 
compact and well-separated. Both rely on the ratio of between-class scatter 
to within-class scatter. The first requires many costly tree comparisons, but 
requires no knowledge of a cluster center. The second makes use of a 
prototype tree which serves as a cluster center. This reduces the complexity 
of the objective function computation substantially. 

The tree transformation may be used to improve cluster performance. 
A tree transformation maps the forest of relational trees to another forest 
which is a subset of the relational trees. The desired subset has clusters 
which are compact and well-separated. A tree transformation is sought which 
improves a given clustering criterion. The tree transformation relies on a 
set of node rewriting rules. It is by varying these rules that tree transforms 
are tailored to a specific application. 








A waveform recognition system which exploits the properties of 
relational trees might be constructed as follows: Training sets of waveforms 
are converted to relational trees and then transformed such that the clusters 
are compact and well-separated. Candidate waveforms are transformed using 
the same set of rewriting rules, and then classified based on their proximity 
to an existing cluster. 

5.2 Seismic Example 

Figure 5.1 shows a typical seismogram of a thin sand imbedded in a shale.



Figure 5.1 

Seismogram over a sand channel (after Neidell [14])


This particular stratigraphic configuration forms an oil trap and therefore
has great economic significance. An expert seismic interpreter can easily
spot such an anomaly. A wavelet with a single peak and a trough becomes a
doublet over the sand lens. It is not so easy for a machine to make such a
qualitative judgement. Due to varying frequencies, noise contamination,
changing bed thickness, and segmentation errors, a purely numerical
technique, such as a matched filter, may not succeed in identifying
those traces that contain sand. However, this is an ideal two-class relational
tree clustering problem.
A seismogram over a known sand lens will serve as a training set of
waveforms. From the corresponding relational trees, a tree transform can
be found which improves clustering performance based on the objective
functions described earlier. Seismic traces from unknown geology may then
be compared to the two clusters and classified as belonging to the cluster of
their nearest k neighbors. Figure 5.2 depicts schematically the two
waveforms in question and their relational trees.


No Sand Present: leaves p5, p0
Sand Present: leaves p0, p5, p5, p0

Figure 5.2

Seismic Waveforms and Relational Trees


Variations of these trees will occur due to noise, varying geology, etc. The
tree transformation will be designed to eliminate these inhomogeneities as
much as possible. The rates of successful classification will be compared
before and after transforming the tree clustering space by simulating the
waveforms, distorting them, and adding noise, then applying the above
procedure.









The number of peak quantization levels used was six. This provided 
reasonable identification of critical peaks, while keeping the tree grammar 
small enough to work with. The candidate rule set chosen for this application 
is shown in table 5.1. 


operation:
p0 => a0(p0);
p1 => a0(p0);
p2 => a0(p0);
p3 => a0(p0);
p4 => a4(p4);
p5 => a5(p5);

operation:
P0(aA,aA) => a0(p0);
P1(aA,aA) => a0(p0);
P2(aA,aA) => a0(p0);

operation:
P3(a3,a0) => a8(P3(s0,s1));
P3(a0,a3) => a8(P3(s1,s0));
P3(a8,aA) => a8(P3(s0));
P3(aA,a8) => a8(P3(s1));
P3(aA,aA) => a3(p3);

operation:
P4(a4,a0) => a7(P4(s0,s1));
P4(a0,a4) => a7(P4(s1,s0));
P4(a7,aA) => a7(P4(s0));
P4(aA,a7) => a7(P4(s1));
P4(aA,aA) => a4(p4);

operation:
P5(a5,a0) => a6(P5(s0,s1));
P5(a0,a5) => a6(P5(s1,s0));
P5(aA,a6) => a6(P5(s1));
P5(a6,aA) => a6(P5(s0));
P5(aA,aA) => a5(p5);

operation:
P6(a0,a6) => aC(P6(s1,s0));

operation:
P6(a7,a6) => aC(P6(s1,s0));

operation:
P6(a8,a6) => aC(P6(s1,s0));

Table 5.1 

Selected tree-transform operations 

The effect of operations 2 through 6 is to propagate a desired subtree once
it is identified. The state labeled aA, which appears in the left-hand side of
the rules, is the wild card state. Since the rule list is scanned from the top
down, rules containing this state can be viewed as a default situation.
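The top-down scan with a wild card state can be sketched as a first-match lookup. The rule encoding below is an assumption: each entry holds the node label, the pattern of child states, the new state, the new label, and the indices of the child subtrees that are kept (s0 and s1 are read here as "first child" and "second child", which is an interpretation of the table's notation). The five rules shown are the P5 operation from table 5.1.

```python
WILD = "aA"  # wild card state: matches any child state

# (label, child-state pattern, new state, new label, kept child indices)
P5_RULES = [
    ("P5", ("a5", "a0"), "a6", "P5", (0, 1)),
    ("P5", ("a0", "a5"), "a6", "P5", (1, 0)),
    ("P5", (WILD, "a6"), "a6", "P5", (1,)),
    ("P5", ("a6", WILD), "a6", "P5", (0,)),
    ("P5", (WILD, WILD), "a5", "p5", ()),   # default: collapse to leaf p5
]


def matches(pattern, states):
    """True if every pattern entry is the wild card or equals the state."""
    return len(pattern) == len(states) and all(
        p == WILD or p == s for p, s in zip(pattern, states))


def first_match(rules, label, states):
    """Scan the rule list from the top and return the first matching rule."""
    for rule in rules:
        if rule[0] == label and matches(rule[1], states):
            return rule
    return None


print(first_match(P5_RULES, "P5", ("a0", "a5")))
print(first_match(P5_RULES, "P5", ("a0", "a0")))
```

Because the all-wild-card rule is last, it fires only when none of the more specific rules above it applies, which is what makes it behave as a default.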

5.3 Seismic Classification 

Waveforms were classified via a k-nearest neighbor scheme [8]. 
Training sets consisted of five samples of each waveform. A value of k=3 was 
found to be effective for classification. A single nearest neighbor scheme 
was found to be ineffective because an outlier from an incorrect cluster may 
be the closest to a given waveform. 
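The effect of an outlier on a single nearest neighbor decision can be seen in a small sketch. The tree distance itself is not reproduced here: the samples are plain numbers and the distance is an absolute difference, both stand-ins chosen only for illustration, as are the function name `knn_classify` and the class labels.

```python
from collections import Counter


def knn_classify(candidate, training, distance, k=3):
    """Return the majority label among the k training samples nearest the candidate.

    `training` is a list of (sample, label) pairs and `distance` is any
    function of two samples (a tree distance in the real system).
    """
    ranked = sorted(training, key=lambda item: distance(candidate, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def toy_distance(a, b):
    # Stand-in for a tree distance.
    return abs(a - b)


# Five samples per class; class A has one outlier (5.0) inside cluster B.
training = [(0.0, "A"), (0.2, "A"), (0.4, "A"), (0.6, "A"), (5.0, "A"),
            (4.0, "B"), (4.2, "B"), (4.4, "B"), (4.6, "B"), (4.8, "B")]

print(knn_classify(5.05, training, toy_distance, k=1))
print(knn_classify(5.05, training, toy_distance, k=3))
```

With k=1 the outlier decides the class, while with k=3 the two nearest members of the surrounding cluster outvote it.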

5.4 Seismic Classification Results 

Waveforms were simulated using cubic splines. The contaminating
noise was bandlimited. The resulting trees before and after transforming are
shown in the appendix. Table 5.2 gives the results of transformation and
classification for various signal to noise ratios. Signal to noise ratio is
listed, followed by the objective function J1 before transformation, the
objective function J1* after transformation, and the observed error rate
Pr[e]. The errors made were all misses, i.e. the classification of a sand
trace as a trace without sand.
S/N      J1      J1*     Pr[e]

         ∞       ∞       0
20 dB            1.11    0
10 dB    0.16    1.54    0.125
5 dB     0.18    1.38    0.187
0 dB     0.16    0.94    0.500
-5 dB    0.12    0.88    0.500

Table 5.2


5.5 Seismic Segmentation

Another useful clustering operation is one that divides a group of 
unknown patterns into clusters. A cluster-seeking system may be used to 
segment a seismic section by forming clusters of similar wavelet structure. 

A hierarchical clustering scheme such as the one discussed in previous
chapters was utilized to form clusters from the simulated seismic data. The
resulting dendrogram using nearest-neighbor clustering is shown in figure
5.3. The tree samples used were from the training set with 10 dB SNR. Samples
0-4 were signals from traces without sand, while signals 5-9 were from
traces over sand lenses.
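Nearest-neighbor hierarchical clustering can be sketched as repeated merging of the two closest clusters, where the distance between clusters is the smallest distance between any pair of their members. The sketch below is generic: the samples are integers and the distance is an absolute difference, both stand-ins for relational trees and a tree distance, and the function name is illustrative.

```python
def single_linkage(items, distance):
    """Agglomerative nearest-neighbor clustering.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where clusters are lists of sample indices. Read bottom-up, the
    history describes a dendrogram.
    """
    clusters = [[i] for i in range(len(items))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Cluster distance = closest pair of members.
                d = min(distance(items[i], items[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges


samples = [0, 1, 4, 10, 12]          # two well-separated groups
for merge in single_linkage(samples, lambda x, y: abs(x - y)):
    print(merge)
```

When the classes are well separated, the last merge in the history joins the two class clusters, which corresponds to the top of the dendrogram.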











We shall now address the problem of interpreting electrocardiogram
(ECG) waveforms. An ECG may be divided into three parts: the P wave, the
QRS complex, and the T wave [15]. A normal ECG waveform is shown in figure
5.4.


Figure 5.4
The Normal ECG


We wish to classify a subset of ECG abnormalities resulting from
supraventricular arrhythmias. These are characterized by changes in the P
wave. Specifically, we wish to determine whether a given ECG exhibits an
inverted P wave associated with premature atrial contraction (PAC), an absent
P wave resulting from atrial tachycardia, or is normal. This is essentially a
three-class detection problem. The waveforms in question are shown in
figure 5.5.



Figure 5.5

Abnormal ECG patterns (after Ganong [15])


5.7 ECG Classification 

As in the previous example, waveforms were simulated with additive
colored Gaussian noise. In order to capture the inhomogeneities encountered
when interpreting ECGs in real situations, variations were allowed in PQ
interval, QRS duration, and ST interval. The training set was then transformed
to improve the objective function J1*. The set of transform rewriting rules
which was found to accomplish this is given in table 5.3:


operation:
p0 => a0(p0);
p1 => a1(p1);
p2 => a2(p2);
p3 => a3(p3);
p4 => a4(p4);
p5 => a5(p5);

operation:
P0(aA,aA) => a0(p0);

operation:
P1(aA,aA) => a1(p1);

operation:
P2(a1,a2) => a6(P2(s0,s1));
P2(a6,aA) => a6(P2(s0));
P2(aA,a6) => a6(P2(s1));
P2(aA,aA) => a2(p2);

operation:
P5(a0,a5) => a5(p5);
P5(a5,a0) => a5(p3);
P5(a1,a5) => a7(P5(s0,s1));
P5(a2,a5) => a7(P5(s0,s1));
P5(a7,aA) => a7(P5(s0));
P5(aA,a7) => a7(P5(s1));

operation:
P5(aA,aA) => a5(p5);


Table 5.3 

ECG transform rules 

The unknown incoming ECG signals were transformed in the same way and
classified by a k-nearest neighbor scheme with k=3. The resulting relational
trees for examples of all three clusters before and after transform are
shown in the appendix.


5.8 ECG Results 

The results of implementing the ECG recognition system are shown in table 
5.4. The system performs well only at high signal to noise ratios. 
S/N      J1      J1*     Pr[e]

20 dB    0.198   1.271   0
15 dB    0.174   0.478   0.081
10 dB    0.159   0.499   0.33
5 dB     0.179   0.302   0.581
0 dB     0.169   0.326   0.581

Table 5.4

ECG Classification Results


5.9 ECG Cluster Seeking 

We may also use hierarchical cluster-seeking to segment a group of
ECG waveforms. Application of a nearest-neighbor cluster seeking algorithm
resulted in the dendrograms shown in figures 5.10 and 5.11. The 10 dB training
set was used. Figure 5.10 shows a two-class example where normal ECGs are
separated from those with PAC. Samples 0-3 were normal, samples 4-7
exhibited PAC.

(dendrogram; leaf order 0, 2, 1, 3, 4, 5, 6, 7)

Figure 5.10

Hierarchical clustering of normal and PAC ECGs
The dendrogram for a three-class problem is given in figure 5.11. In this
example, samples 0-3 were normal, samples 4-7 exhibited PAC, and 8-11 were
tachycardic.

(dendrogram; leaf order 0, 1, 2, 3, 5, 7, 11, 4, 6, 8, 9, 10)

Figure 5.11

Three-class ECG hierarchical clustering dendrogram

5.10 Discussion of Results 

The relational trees shown in the appendix demonstrate the filtering 
capabilities of the tree transform. The fluctuations in basic wave shape are 
removed by propagating the desired subtrees and eliminating those subtrees 
judged as superfluous. By storing time and amplitude information as node 
attributes, waveforms may be reconstructed from the transformed trees. 

The results of this experiment can be compared to the existing numerical 
techniques. Since within each signal class there are infinitely many variations 
in the waveform, an infinite number of matched filters would be required to 
accurately represent the problem. Assuming the signal set could be limited to 
a finite number of possible forms, a bank of matched filters would perform 
better than the technique presented here at low SNRs, but not as well for high
SNRs. When the amplitude of the noise becomes large enough to distort the
peak dominance relations in the tree representation, the method breaks down.
A further consideration is the complexity of the system. Banks of matched
filters are difficult to implement, and require many fixed or floating point 
operations. The relational tree clustering technique, once the transform 
operations have been selected, requires only addressing operations and 
integer comparisons. 

If a prototype tree can be constructed, system complexity is reduced
linearly, keeping the number of tree comparisons to a minimum. In addition,
the tree transformation speeds up the tree distance calculations by the
square of the number of nodes removed since the distance computation runs
in O(n^2) time, where n is the number of nodes.

Comparing tables 5.2 and 5.4, it becomes apparent that the performance
in terms of classification error is much worse for the ECG case than for the
seismic case. The explanation for this lies in the increased complexity of the
ECG waveforms over the seismic waveforms. The need to distinguish complex
trees from one another, and to recognize and eliminate noise subtrees, calls
for a carefully constructed set of tree transform rewriting rules. The
effectiveness of the transformation is limited by the candidate rules supplied
by the human user. A methodical rule-selection algorithm is worth
investigating. Information for the automatic design of a transform rule set
could be gathered as a byproduct of the distance calculation. Since the
distance between two trees results from the best matching between individual
nodes, a large amount of detail concerning the correlation between subtrees
is simply being ignored. The precise method by which this additional
information is put to use needs to be the subject of further research. Linking
the transform design to the tree matching process would not only simplify the
rule selection, but would reduce the complexity of the entire system
markedly.

In addition to improving the rule selection process, the rules
themselves might be enhanced by taking semantic information into account.
More information may be encoded in the rules annotatively. If the single
attribute 'valley height' were to be included at each non-terminal node in the
relational tree, more intelligent filtering could take place. The decision to
propagate a subtree as relevant information, or to classify it as noise and
prune it, depends to some extent on the depth of the parent node's valley. It
was this inability to distinguish absolute valley depths that limited the success
of the ECG rule set. The inclusion of at least some semantic information is
evidently crucial for the recognition of complex waveforms. The theory for
handling semantic information in the tree transform needs to be thoroughly
developed before any improvements can be made to the implementation.








Chapter 6 
Conclusions 

We have endeavored to construct a system which will identify and 
classify waveforms based on their underlying structural similarities. An 
elegant and convenient means to represent a signal's structure, without 
regard to absolute magnitudes or precise timing, is the relational tree. The 
relational tree is a computer data structure that represents a waveform by 
the relative placement of peaks and valleys. 

We can treat the relational tree much as a vector in pattern space. 
Several pseudo-metrics have been developed for measuring distance between 
trees. Using a tree metric and many of the concepts from traditional cluster 
analysis, we have designed a waveform recognition system. A key element of 
the waveform recognition system is the tree transform. The reason for 
transforming the tree space is to improve the clustering configuration so 
that unknown candidate trees lie close to their prototypes in a sample training 
set. The actions of a tree transform are determined by the rewriting rules,
which are mappings between subtrees. By varying the rewriting rule set, a
transform may be found which improves tree clustering performance. We
have introduced two objective functions for measuring that performance.
They vary in action and complexity.

After implementing the waveform recognition system and testing it on 
simulated reflection seismic and electrocardiogram data, the following 
observations may be made. 
(1) The symbolic recognition system in its present form is only feasible
if the tree complexity is low, i.e. the signal contains a small number of peaks 
and valleys. 

(2) For these waveforms, the classification error is equal to or better
than numerical techniques at high signal to noise ratios, but abruptly becomes
worse as relative peak and valley heights are altered by noise.

(3) A system of practical complexity requires automated rule selection. 
Information needed to select these rules is available as a byproduct of the 
tree metric. 

(4) The transform rule set could be greatly enhanced by adding 
semantic information, but the theory governing such a transform has yet to be 
developed. 

The decision to use relational tree waveform recognition must be made 
on the merits of each individual set of waveforms. Waveforms whose 
relational tree structures are similar will be difficult to distinguish in tree 
space. The strength of the RT method becomes apparent if the relational tree 
structures of opposing clusters lie far apart in tree space, as in the seismic 
example. 






Appendix 


The following relational trees are taken from the seismic example with
5 dB signal-to-noise ratio. The trees are shown before and after undergoing
transformation with the rewriting rules in table 5.1.


No Sand Present (two examples): relational trees before and after transform.

Sand Present (two examples): relational trees before and after transform.